From Revisiting the LCA-based Approach to a New Semantics-based Approach for XML Keyword Search
Abstract
Most keyword search approaches for data-centric XML documents are based on the computation of Lowest Common Ancestors (LCA), such as SLCA and MLCA. In this paper, we show that the LCA is not always a correct search model for processing keyword queries over general XML data. In particular, when an XML database contains relationships among objects, which is quite common in practical data, LCA-based search may not be able to find desired answers for many keyword queries. We propose to use semantics instead of the structure of XML data to perform keyword search, and show that the semantics-based search can solve the problems of the LCA-based approach. To the best of our knowledge, this is the first work to point out serious problems of the LCA-based XML keyword search approach, and propose an approach to perform XML keyword search based on semantics rather than the hierarchical structure of XML data to address those problems.
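The limitation described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm or its example: the toy document, the Dewey-style labels, the `index` dictionary, and the helper names `lca`, `slca`, and `search` are all assumptions made for illustration. The sketch computes SLCA by brute force and shows that a query naming two students who are related only through a shared course returns the document root.

```python
from itertools import product

def lca(a, b):
    """Lowest common ancestor of two Dewey labels = their longest common prefix."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return tuple(prefix)

def is_proper_ancestor(a, b):
    """True if label a is a strict prefix of label b."""
    return len(a) < len(b) and b[:len(a)] == a

def slca(match_lists):
    """Brute-force SLCA: take the LCA of every combination of one match per
    keyword, then drop any candidate that has a proper descendant in the set."""
    candidates = set()
    for combo in product(*match_lists):
        node = combo[0]
        for other in combo[1:]:
            node = lca(node, other)
        candidates.add(node)
    return sorted(
        c for c in candidates
        if not any(is_proper_ancestor(c, d) for d in candidates)
    )

# Hypothetical toy document (labels are Dewey paths from the root):
# (0,)      root
# (0,0)     student    (0,0,0) name "Alice"   (0,0,1) course "DB"
# (0,1)     student    (0,1,0) name "Bob"     (0,1,1) course "DB"
index = {
    "alice": [(0, 0, 0)],
    "bob":   [(0, 1, 0)],
    "db":    [(0, 0, 1), (0, 1, 1)],
}

def search(keywords):
    """Look up each keyword in the toy index and return the SLCA nodes."""
    return slca([index[k] for k in keywords])

print(search(["alice", "db"]))   # [(0, 0)]: Alice's own student element
print(search(["alice", "bob"]))  # [(0,)]: only the document root
```

In this toy data, the meaningful connection between Alice and Bob is the course they share, but that course is stored once under each student, so the tree structure alone yields only the root. A search that treats the two "DB" elements as the same object could return that relationship instead.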
Similar Resources
Schema-Independence in XML Keyword Search
XML keyword search has attracted a lot of interest, with typical search based on the lowest common ancestor (LCA). However, in this paper, we show that meaningful answers can be found beyond LCA and should be independent of the schema design chosen for the same data content. Therefore, we propose a new semantics, called CR (Common Relative), which not only can find more answers beyond LCA, but the returned ...
Keyword Search in Bibliographic XML Data
Keyword search is a user-friendly way to query text, HTML, XML documents and even relational databases. The well-known LCA (Lowest Common Ancestor) semantics is used for XML keyword search based on the tree model. However, LCA cannot exploit the information in ID references, and thus may return a large tree containing irrelevant results. Another keyword search approach based on general digra...
Processamento de consultas baseadas em palavras-chave sobre fluxos XML
XML streams have become a relevant research topic due to the widespread use of applications such as online news, RSS feeds, and dissemination systems. Such streams must be processed rapidly and without retention. Retaining streams could cause data loss due to the large data traffic in continuous processing. This context becomes more complex when thousands of queries must be evaluated simultaneo...
Object Semantics for XML Keyword Search
It is well known that some XML elements correspond to objects (in the sense of object-orientation) and others do not. The question we consider in this paper is what benefits we can derive from paying attention to such object semantics, particularly for the problem of keyword queries. Keyword queries against XML data have been studied extensively in recent years, with several lowest-common-ances...
From Structure-Based to Semantics-Based: Towards Effective XML Keyword Search
Existing XML keyword search approaches can be categorized into tree-based search and graph-based search. Both are structure-based because they rely mainly on exploring the structural features of the document. Such structure-based approaches cannot fully exploit the hidden semantics in XML documents, which causes serious problems in processing some classes of keyword queries. In thi...